ABSTRACT


Three neural network processing approaches in a direct numerical optimization model reduction 
scheme are proposed and investigated. 


INTRODUCTION 


Large structural systems, such as large space structures, offer new challenges to both structural 
dynamicists and control engineers. One such challenge is that of dimensionality. Indeed these distributed 
parameter systems can be modeled either by infinite dimensional mathematical models (typically partial 
differential equations) or by high dimensional discrete models (typically finite element models) often 
exhibiting thousands of vibrational modes usually closely spaced and with little, if any, damping. Clearly, some 
form of model reduction is in order, especially for the control engineer who can actively control but a few of the 
modes using system identification based on a limited number of sensors. Inasmuch as the amount of "control
spillover" (in which the control inputs excite the neglected dynamics) and/or "observation spillover" (where
neglected dynamics affect system identification) is to a large extent determined by the choice of a particular 
reduced model (RM), the way in which this model reduction is carried out is often critical. 

Different techniques to obtain RM's have been proposed by various authors. While they are based on 
the same philosophy of retaining only those modes which play a significant role, they differ in the way the 
roles of the modes are quantified. Among these techniques we mention: (i) Modal Truncation; (ii) Balanced 
Controller Reduction; (iii) Component Cost Analysis; (iv) Optimal Projection Conditions; (v) Energy Based 
Model Reduction (also referred to as Modal Performance Tracking); (vi) Subsystem Balancing. (See [1] for 
references on methods (ii-iv), [2] and the references therein for method (v) and [3] for (vi).) 

Model reduction can also be viewed as providing an answer to the question: What are the $m < n$
linear combinations of the $n < \infty$ states of the full model which best describe the behavior of the system?

The various techniques differ only in the way "best" is defined. As such, model reduction is an optimization
problem. In fact, most model reduction schemes first attempt to find an analytical solution to the optimization
problem, using necessary optimality conditions to obtain one or several equations to be satisfied by the
solution, which can then be solved in an iterative numerical scheme. Viewed in this light, most currently
available model reduction schemes suffer from three shortcomings: (i) they are restricted to optimality criteria
for which a (partial) analytical solution to the optimization problem can be found, (ii) being based on
necessary conditions, they cannot guarantee that the solution so obtained is the actual optimum sought, and
(iii) the iterative numerical construction of the solution can be a formidable task. Recently, to alleviate the
above shortcomings, we proposed to carry out model reduction by direct numerical solution of the optimization
problem [4]. In this paper we propose and investigate the use of neural network processing methods to carry
out this direct optimization. First we review the direct numerical optimization approach proposed in [4].

1 The work of both authors was supported in part by NASA-Lewis Research Center under Grant NAG3-1174.


DIRECT NUMERICAL OPTIMIZATION METHOD 


Consider the n-th order linear time-invariant state space model of a large structural system

$$\dot{x} = A x + B u \tag{1a}$$
$$y = C x . \tag{1b}$$

Here $x$, $u$ and $y$ are the n, r and p-dimensional state, input and output vectors respectively, $A$, $B$ and $C$ are
constant matrices of appropriate dimensions and the system is assumed to be completely controllable. Model
reduction consists of finding a model of order $m < n$

$$\dot{x}_m = A_m x_m + B_m u \tag{2a}$$
$$y_m = C_m x_m . \tag{2b}$$

Here $x_m$ and $y_m$ are m and p-dimensional state and output vectors, while $A_m$, $B_m$ and $C_m$ are constant
matrices of appropriate dimensions, which "best approximates" the full order model (1a,b).

In this paper, as in [4], we restrict ourselves to model reduction schemes based on an integral-square-error
performance index (in particular to the optimal projection method of Hyland and Bernstein [1,5]), but the
methodology is applicable to other schemes as well. We are thus interested in determining matrices $A_m$, $B_m$
and $C_m$ which minimize

$$J(A_m, B_m, C_m) = \lim_{t \to \infty} E[(y - y_m)^T R (y - y_m)] \tag{3}$$

when $u$ is white noise with intensity $V$. In (3) $E[\,\cdot\,]$ denotes expected value and $R$ is a positive
definite weighting matrix.


Introducing the augmented system of order n+m

$$\dot{x}_a = A_a x_a + B_a u \tag{4a}$$
$$y_a = C_a x_a , \tag{4b}$$

where

$$x_a = \begin{bmatrix} x \\ x_m \end{bmatrix} , \quad y_a = y - y_m , \quad A_a = \begin{bmatrix} A & 0 \\ 0 & A_m \end{bmatrix} , \quad B_a = \begin{bmatrix} B \\ B_m \end{bmatrix} , \quad C_a = \begin{bmatrix} C & -C_m \end{bmatrix} , \tag{5}$$

the optimality criterion (3) is written as

$$J(A_m, B_m, C_m) = \lim_{t \to \infty} E[y_a^T R\, y_a] = \mathrm{tr}[Q_a R_a] , \tag{6}$$

where $Q_a$ is the positive semidefinite solution of the Lyapunov equation

$$0 = A_a Q_a + Q_a A_a^T + B_a V B_a^T \tag{7}$$

and

$$R_a = C_a^T R\, C_a . \tag{8}$$
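As a brief check of the trace expression for the criterion, here is a standard stochastic-systems argument, offered as a sketch rather than as a step taken from the text. It assumes that $A_a$ is asymptotically stable, so that the limit exists and $Q_a$ is the steady-state covariance of $x_a$ under white noise of intensity $V$:

```latex
\begin{aligned}
\lim_{t\to\infty} E[y_a^T R\, y_a]
  &= \lim_{t\to\infty} E[x_a^T C_a^T R\, C_a x_a] \\
  &= \lim_{t\to\infty} \mathrm{tr}\!\left( C_a^T R\, C_a \, E[x_a x_a^T] \right) \\
  &= \mathrm{tr}[Q_a R_a] ,
  \qquad Q_a = \lim_{t\to\infty} E[x_a x_a^T] , \quad R_a = C_a^T R\, C_a .
\end{aligned}
```

The second line uses the fact that a scalar equals its own trace, together with the cyclic property of the trace.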


The model reduction problem has been recast as the optimization problem:

$$\min \ \mathrm{tr}[Q_a R_a] \tag{9}$$
$$\text{subject to} \quad 0 = A_a Q_a + Q_a A_a^T + B_a V B_a^T . \tag{7}$$

Similar results hold for other integral-square-error performance indices (see [6] for example).
Introducing the partition

$$Q_a = \begin{bmatrix} Q_1 & Q_2 \\ Q_2^T & Q_m \end{bmatrix} , \tag{10}$$

compatible with partitions (5), the constraint (7) is decomposed as

$$0 = A Q_1 + Q_1 A^T + B V B^T \tag{11a}$$
$$0 = A Q_2 + Q_2 A_m^T + B V B_m^T \tag{11b}$$
$$0 = A_m Q_m + Q_m A_m^T + B_m V B_m^T . \tag{11c}$$


Note, from (11a), that $Q_1$ is completely determined from knowledge of the full model. Thus expanding the
objective function in (9) and neglecting the constant term involving $Q_1$, the optimization problem (9,7) is
rewritten as

$$\min \ \{ \mathrm{tr}[Q_m C_m^T R\, C_m] - 2\, \mathrm{tr}[Q_2^T C^T R\, C_m] \} \tag{12}$$
$$\text{subject to} \quad 0 = A Q_2 + Q_2 A_m^T + B V B_m^T \tag{11b}$$
$$0 = A_m Q_m + Q_m A_m^T + B_m V B_m^T . \tag{11c}$$
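The expansion behind this reduced objective is short enough to spell out. The following is a sketch that uses only the block partitions of the augmented matrices introduced above, with $R$ taken to be symmetric (it is a positive definite weighting matrix):

```latex
\begin{aligned}
R_a &= C_a^T R\, C_a
     = \begin{bmatrix} C^T R\, C & -C^T R\, C_m \\ -C_m^T R\, C & C_m^T R\, C_m \end{bmatrix} , \\
\mathrm{tr}[Q_a R_a]
    &= \mathrm{tr}[Q_1 C^T R\, C] - \mathrm{tr}[Q_2 C_m^T R\, C]
       - \mathrm{tr}[Q_2^T C^T R\, C_m] + \mathrm{tr}[Q_m C_m^T R\, C_m] \\
    &= \underbrace{\mathrm{tr}[Q_1 C^T R\, C]}_{\text{independent of } A_m,\, B_m,\, C_m}
       + \ \mathrm{tr}[Q_m C_m^T R\, C_m] - 2\, \mathrm{tr}[Q_2^T C^T R\, C_m] .
\end{aligned}
```

The two cross terms coincide because a matrix and its transpose have the same trace, which is where the factor 2 comes from.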

Note that all of the above manipulations were aimed at transforming the statement of the optimization
problem and not at obtaining a (partial) solution. Thus, this approach does indeed alleviate the first two of the
shortcomings mentioned earlier since it is not restricted to particular optimality criteria (although it was
illustrated here for a particular one), and it is guaranteed to yield at least a local minimum. In addition, we can
choose the numerical optimization scheme which is best adapted to the particular optimization problem which
the RM must satisfy. In [4] some promising preliminary results for a classic and somewhat pathological
example [5,7] and the use of a generalized reduced gradient algorithm [8] were presented. Here we investigate
the feasibility of using neural network processing methods to solve the optimization problem (9,7) or
(12,11b,c). Improving the computational efficiency for large problems through massive parallelization is the
motivation for using these methods, thus alleviating the third shortcoming.


NEURAL NETWORK PROCESSING METHOD 


The neural network processing method is an extension of the Hopfield neural network model [9] which 
has been successfully used to solve combinatorial optimization problems such as the Travelling Salesman 
problem. Developed by W. Jeffrey and R. Rosner to solve a class of ill posed inverse problems, the neural 
network processing method [10] is a reformulation of the Hopfield model. Our aim is to apply this 
methodology to the model reduction problem. We begin with some details of the method. 

Consider a network, possibly modeled by analog electronic components, the energy $E$ of which at any
time can be expressed as a quadratic function of its state $x$ as

$$E(x) = -x^T W x + 2 T^T x . \tag{13}$$

$E(x)$ can be regarded as the objective function in an optimization problem for which $x$ is the design variable.
Matrix $W$ and vector $T$ are constant valued and arise from the mapping of the optimization problem into the
above format.

The change in the energy function resulting from a discrete step, i.e. a change $\Delta x_k$ in a single element
$x_k$ of $x$, can be shown to be given as

$$\Delta E_k = ( -2 w_k x + 2 T_k - w_{kk} \Delta x_k ) \Delta x_k \tag{14}$$

where $\Delta x_k = \lambda_k ( -2 w_k x + 2 T_k )$, $w_k$ being the k-th row of $W$, $w_{kk}$ the k,k-th element of $W$, $T_k$ the
k-th element of $T$ and $\lambda_k$ the step size for $\Delta x_k$. The parallel processing capabilities come into play here since
all the elements of $x$ can be changed simultaneously, increasing the computational speed.

We now continue changing $x$ in this manner until $\Delta E_k = 0$ for all k. The state so obtained represents a
minimum energy state. By adjusting the size of $\lambda_k$ we can show that $\Delta E_k < 0$ for all $\Delta x_k$. Since we can
reduce equation (14) to

$$\Delta E_k = \left( \frac{1}{\lambda_k} - w_{kk} \right) (\Delta x_k)^2 , \tag{15}$$

then $\Delta E_k \le 0$ when $w_{kk} \ge 1/\lambda_k$ for $\lambda_k < 0$.

Hopfield and Tank [9] showed that the stable state reached is a minimum for the optimization problem. 
Jeffrey and Rosner [10] extended this formulation by allowing for higher order (i.e. non quadratic) terms to be 
included in the energy function when necessary. The details of their formulation, being similar to the analysis 
just presented, are not given here. 
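The single-element update analysed above can be sketched in a few lines. This is a minimal serial illustration, not the authors' implementation: the function names, the 2-state matrices and the fixed step size are all hypothetical, and $W$ is assumed symmetric so that the closed form (15) applies. The sketch records, for every step, the energy change predicted by (15) next to the change actually observed.

```python
def energy(W, T, x):
    """Quadratic energy E(x) = -x^T W x + 2 T^T x."""
    n = len(x)
    quad = sum(x[i] * W[i][j] * x[j] for i in range(n) for j in range(n))
    return -quad + 2.0 * sum(T[i] * x[i] for i in range(n))


def descend(W, T, x0, lam=-0.1, max_sweeps=1000, tol=1e-12):
    """Change one element of x at a time by dx_k = lam * (-2 w_k x + 2 T_k)."""
    x = list(x0)
    n = len(x)
    checks = []  # (change predicted by the closed form, change actually observed)
    for _ in range(max_sweeps):
        largest = 0.0
        for k in range(n):
            slope = -2.0 * sum(W[k][j] * x[j] for j in range(n)) + 2.0 * T[k]
            dx = lam * slope
            predicted = (1.0 / lam - W[k][k]) * dx * dx
            before = energy(W, T, x)
            x[k] += dx
            checks.append((predicted, energy(W, T, x) - before))
            largest = max(largest, abs(dx))
        if largest < tol:  # no element moves any more: stable state reached
            break
    return x, checks


# Hypothetical 2-state example; W is symmetric and 1/lam < w_kk for every k,
# so every single-element step lowers the energy.
W = [[-2.0, 0.5], [0.5, -1.0]]
T = [1.0, -1.0]
x_min, checks = descend(W, T, [0.0, 0.0])
```

For this convex example the stable state satisfies $W x = T$. A hardware or parallel realization would update all elements at once, which is the source of the speed-up mentioned in the text but is not covered by the single-element bound.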

Note that the neural network processing method of Jeffrey and Rosner is restricted to unconstrained 
optimization problems. Before applying it to the model reduction application at hand, the constrained
optimization problem (9,7) or (12,11b,c) must first be recast as an unconstrained one. We now present three
ways in which this can be accomplished: first a penalty function approach, then by solving the problem as a 
sequence of unconstrained problems in a multi-stage approach, and finally a substitution approach in which the 
constraint equation is solved and substituted into the objective function. 


PENALTY FUNCTION APPROACH 


The penalty function approach incorporates all of the constraints into the energy function via penalty 
terms. The problem becomes an unconstrained problem for the penalty function. This is accomplished in two 
steps: 


1. The equality constraints (7) or (11b,c) are incorporated into the energy function to create a modified
Lagrangian or penalty function [11], that is:

$$E(x) = F(x) + \sum_{i,j} [ \phi\, h_{ij}^2(x) + \gamma\, \lambda_{ij} h_{ij}(x) ] \tag{16}$$

where $F(x)$ is the objective function of the constrained problem ($\mathrm{tr}[Q_a R_a]$ or $\{\mathrm{tr}[Q_m C_m^T R\, C_m] -
2\, \mathrm{tr}[Q_2^T C^T R\, C_m]\}$ for the problem at hand), $\phi$ and $\gamma$ are penalty parameters, $\lambda_{ij}$ are Lagrange multipliers and
$h_{ij}$ is the ij-th element of the equality constraint.

2. The underlying inequality constraint $Q_a \ge 0$ is enforced by factoring $Q_a$ as the product of an upper
triangular matrix $M_a$, partitioned as

$$M_a = \begin{bmatrix} M_1 & M_2 \\ 0 & M_m \end{bmatrix} , \tag{17}$$

and its transpose. In (17) $M_1$ and $M_m$ are upper triangular matrices such that $Q_2 = M_2 M_m^T$ and $Q_m = M_m M_m^T$.
These are substituted into the energy function so that the vector of design variables $x$ is made up of
(i) elements of $A_m$, (ii) elements of $B_m$, (iii) elements of $M_2$, (iv) non zero (i.e. upper triangular)
elements of $M_m$, and (v) $\lambda_{ij}$, the Lagrange multipliers.

The Modified Differential Multiplier Method (MDMM), proposed by Platt [12] for use in neural 
network processing, is then used to solve the problem. This essentially amounts to applying gradient ascent on 
the Lagrange multipliers while applying gradient descent on all of the other design variables. 
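The descent/ascent idea can be illustrated on a toy problem. The sketch below is not the model reduction problem itself: it assumes an energy of the form $F + \phi h^2 + \gamma \lambda h$ with a single scalar constraint, and the objective, constraint, parameter values and function name are all hypothetical choices made for illustration.

```python
def mdmm_demo(phi=0.5, gamma=1.0, step=0.05, iterations=5000):
    """Minimize F(x) = x0^2 + x1^2 subject to h(x) = x0 + x1 - 1 = 0."""
    x = [0.0, 0.0]  # design variables
    lam = 0.0       # Lagrange multiplier
    for _ in range(iterations):
        h = x[0] + x[1] - 1.0
        # dE/dx_i for E = F + phi*h^2 + gamma*lam*h, using dh/dx_i = 1
        grad = [2.0 * xi + 2.0 * phi * h + gamma * lam for xi in x]
        # gradient descent on the design variables ...
        x = [xi - step * gi for xi, gi in zip(x, grad)]
        # ... and gradient ascent on the multiplier (dE/dlam = gamma*h)
        lam += step * gamma * h
    return x, lam


x_opt, lam_opt = mdmm_demo()
```

With these settings the iteration settles at the constrained minimizer $x = (0.5, 0.5)$ with multiplier $\lambda = -1$. The quadratic penalty term damps the oscillation that pure descent/ascent on the Lagrangian would otherwise exhibit, which is why the choice of $\phi$ and of the step size matters in practice.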


MULTI-STAGE APPROACH 


The multi-stage approach is loosely based on a model reduction algorithm proposed by Wilson [13]. It
is simply the following algorithm:

1. Pick initial guesses for matrices $A_m$ and $B_m$.

2. Calculate $Q_2$ and $Q_m$.

3. Minimize the objective function using the neural network processing method with elements of $B_m$ as
the only design variables.

4. Update the $A_m$ matrix using $A_m = Q_2^T A\, Q_2\, Q_m^{-1}$. (This is analogous to the necessary
condition for an optimum used by Wilson [13].)

5. Go to step 2 until the objective function stops changing from iteration to iteration.

Note that in this approach, the minimization problem of step 3 is an unconstrained problem. Thus the model 
reduction problem is solved as a sequence of unconstrained optimization problems.


SUBSTITUTION APPROACH


In the substitution approach the $Q_2$ and $Q_m$ matrices, as solutions of (11b,c), are functions of $A_m$ and
$B_m$ which are substituted in the objective function of (12) to yield an unconstrained problem where the
elements of $A_m$ and $B_m$ are the only design variables. Neural network processing is then used with the energy
function

$$E = \mathrm{tr}[Q_m(A_m, B_m)\, C_m^T R\, C_m] - 2\, \mathrm{tr}[Q_2^T(A_m, B_m)\, C^T R\, C_m] .$$
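To make the substitution concrete, the sketch below evaluates this energy for a given pair ($A_m$, $B_m$) by solving the two linear matrix equations (11b,c) and inserting the results into the objective of (12). It is an illustration under stated assumptions, not the authors' code: NumPy is assumed to be available, the helper names are hypothetical, and the matrix equations are solved by Kronecker-product vectorization, which is only practical for small problems. The data are the small second-order example quoted in the RESULTS section.

```python
import numpy as np


def solve_sylvester(A, Am, G):
    """Solve A X + X Am^T + G = 0 for X by Kronecker vectorization."""
    n, m = A.shape[0], Am.shape[0]
    # With column-major vec: vec(A X) = (I_m kron A) vec(X)
    # and vec(X Am^T) = (Am kron I_n) vec(X).
    K = np.kron(np.eye(m), A) + np.kron(Am, np.eye(n))
    x = np.linalg.solve(K, -G.reshape(-1, order="F"))
    return x.reshape((n, m), order="F")


def substituted_energy(A, B, C, Am, Bm, Cm, R, V):
    """E(A_m, B_m) = tr[Q_m Cm^T R Cm] - 2 tr[Q_2^T C^T R Cm]."""
    Q2 = solve_sylvester(A, Am, B @ V @ Bm.T)    # constraint (11b)
    Qm = solve_sylvester(Am, Am, Bm @ V @ Bm.T)  # constraint (11c)
    first = np.trace(Qm @ Cm.T @ R @ Cm)
    cross = np.trace(Q2.T @ C.T @ R @ Cm)
    return float(first - 2.0 * cross)


# Collocated actuators and sensors: C = B^T, Cm = Bm^T; R = V = identity.
A = np.array([[-0.005, -0.99], [0.99, -5000.0]])
B = np.array([[1.0], [100.0]])
Am = np.array([[-4998.1]])
Bm = np.array([[100.0]])
R = V = np.eye(1)
energy_value = substituted_energy(A, B, B.T, Am, Bm, Bm.T, R, V)
print(round(energy_value, 1))  # close to the reported objective of -10004.0
```

Any unconstrained minimizer can then be applied to `substituted_energy` as a function of the entries of $A_m$ and $B_m$ alone. Each evaluation requires two linear solves, so the cost per energy evaluation grows quickly with the order of the full model.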


RESULTS 


In all examples considered we assumed that actuators and sensors were collocated so that $B = C^T$ and
$B_m = C_m^T$, and, without loss of generality, that $R$ and $V$ are identity matrices of appropriate dimensions.

All three methods presented solved only problems of a very limited scope: all methods were able to 
solve very small real eigenvalue problems, but all showed an inability to solve problems of a practical size and 
nature. For example all three methods yielded an optimal solution for the following very simple problem 
considered in [4] (and given here with its solution) 



$$A = \begin{bmatrix} -.005 & -.99 \\ .99 & -5000 \end{bmatrix} , \quad B = \begin{bmatrix} 1 \\ 100 \end{bmatrix}$$

$$A_m = [-4998.1] , \quad B_m = [100.0] , \quad \text{obj} = -10004.0 .$$


The point of interest of this example is that some model reduction techniques yield a solution corresponding to
a maximum rather than the minimum [5,7].

a. Penalty Function Approach 


The penalty function approach exhibited poor performance in solving model reduction problems. It was 
able to solve problems in which the original A matrix was 4x4 and the reduced matrix $A_m$ was 2x2; however,
this was the largest problem that we were able to solve using this method. The encouraging fact is that the 
method did yield good, possibly optimal, solutions to a few small problems with complex eigenvalues. For 
example the following problem (given here with its solution) was solved successfully 



$$A = \begin{bmatrix} -.1 & -10 & 0 & 0 \\ 10 & -.1 & 0 & 0 \\ 0 & 0 & -.5 & -15 \\ 0 & 0 & 15 & -.5 \end{bmatrix} , \quad B = \begin{bmatrix} 1 \\ 3 \\ -2 \\ 3 \end{bmatrix}$$

$$A_m = \begin{bmatrix} -.124 & -10.075 \\ 9.924 & -.0794 \end{bmatrix} , \quad B_m = \begin{bmatrix} -2.867 \\ -1.392 \end{bmatrix} , \quad \text{obj} = -253.4 .$$

Difficulties with this approach were due to a lack of good guiding principles in setting step size and penalty
parameters, slow convergence, and an apparently large number of local minima.

b. Multi-Stage Approach

The multi-stage approach exhibited a slightly different behavior. Since the traditional optimization
portion of the algorithm which was carried out using neural network processing involved a much smaller
problem, the method was able to solve overall larger problems. However, the approach would not solve
problems with complex eigenvalues but would successfully solve problems with strictly real eigenvalues. The
maximum size of these models was 6 inputs, 6 outputs with 16x16 A matrices. As problems with strictly real
eigenvalues have little practical application, this approach was abandoned.

c. Substitution Approach 

The substitution approach presented basically the same difficulties as the penalty approach. Although
it successfully solved the example given in the penalty function approach subsection above, yielding the same
solution, it showed limitations in that it was unable to solve problems with A matrices bigger than 4x4.


CONCLUDING REMARKS 



The results obtained so far have not lived up to our expectations when we embarked on this 
investigation. In all fairness it must be pointed out that the difficulties encountered do not appear to be a result 
of the neural network processing approach. Parallel investigations using a standard optimization software 
package [8] were also disappointing. The difficulty appears to stem from the fact that the objective function 
has apparently a large number of local minima. In particular, it appears that any reasonable starting point is a 
local minimum! 

A positive result in our lack of success in solving practical sized problems is the development of a type 
of modal cost analysis based on the objective function developed for the optimization methods. In this method 
we transform the system matrices such that the A matrix has 2x2 blocks on the main diagonal, each block 
corresponding to a mode of the structural system, and the B matrix is consistent with these new coordinates. 
Next we calculate the objective function for each 2x2 system individually. The objective values for all of the 
individual (1 mode) reduced models are sorted and the lowest ones are retained. At this time we have not put 
enough time into this approach to make any firm statement about the quality and cost of these solutions. 
However, preliminary results are encouraging. We have reduced models with A matrices up to 168x168 (the
JPL/AFAL experiment structure) down to $A_m$ matrices of 108x108 yielding excellent results when looking at
the time response characteristics. We are now looking into this method in more detail to see if this approach 
can be used to obtain directly or aid us in finding optimal reduced models. Results will be reported elsewhere 
as they become available. 
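One possible reading of this modal ranking is sketched below. It is a hypothetical illustration only: it assumes that each mode is scored by evaluating the objective of (12) with that single mode taken as the reduced model, that actuators and sensors are collocated with $R$ and $V$ equal to the identity, and that NumPy is available. The function names are invented for the sketch, and the two-mode data are patterned on the 4x4 example of the RESULTS section.

```python
import numpy as np


def solve_sylvester(A, Am, G):
    """Solve A X + X Am^T + G = 0 for X by Kronecker vectorization."""
    n, m = A.shape[0], Am.shape[0]
    K = np.kron(np.eye(m), A) + np.kron(Am, np.eye(n))
    x = np.linalg.solve(K, -G.reshape(-1, order="F"))
    return x.reshape((n, m), order="F")


def rank_modes(blocks, B_parts):
    """Score every one-mode reduced model and sort the modes, lowest first."""
    n = 2 * len(blocks)
    A = np.zeros((n, n))
    for i, blk in enumerate(blocks):  # block-diagonal modal form of A
        A[2 * i:2 * i + 2, 2 * i:2 * i + 2] = blk
    B = np.vstack(B_parts)
    C = B.T  # collocated actuators and sensors
    V = np.eye(B.shape[1])
    R = np.eye(C.shape[0])
    scores = []
    for Am, Bm in zip(blocks, B_parts):
        Cm = Bm.T
        Q2 = solve_sylvester(A, Am, B @ V @ Bm.T)
        Qm = solve_sylvester(Am, Am, Bm @ V @ Bm.T)
        value = np.trace(Qm @ Cm.T @ R @ Cm) - 2.0 * np.trace(Q2.T @ C.T @ R @ Cm)
        scores.append(float(value))
    order = sorted(range(len(scores)), key=scores.__getitem__)
    return order, scores


blocks = [np.array([[-0.1, -10.0], [10.0, -0.1]]),
          np.array([[-0.5, -15.0], [15.0, -0.5]])]
B_parts = [np.array([[1.0], [3.0]]), np.array([[-2.0], [3.0]])]
order, scores = rank_modes(blocks, B_parts)
```

Retaining the first entries of `order` keeps the modes with the lowest objective values. For this two-mode data the lightly damped first mode ranks ahead of the second, which is in line with the reduced model reported for the 4x4 example being close to that mode.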